UNHCR compiles official statistics on stocks and flows of forcibly displaced and stateless persons twice a year: once for mid-year figures (Mid-Year Statistical Reporting, MYSR) and once for end-year figures (Annual Statistical Reporting, ASR). For these reporting exercises, country operations compile aggregate population figures from a range of sources and data producers, such as governments, UNHCR’s own refugee registration database proGres and, in some cases, non-governmental actors. The figures undergo statistical quality control at the country, regional and global levels of the organisation. They are then disseminated on the publicly available refugee data finder (https://www.unhcr.org/refugee-statistics/) after statistical disclosure control has been applied to suppress very small counts that could identify individuals.
The end-year figures, compiled with a reporting date of 31 December, contain sex and age breakdowns of the stocks of displaced and stateless people under UNHCR’s mandate. Table @ref(tab:demref2020) displays the sex- and age-disaggregated data on the stock of refugees under UNHCR’s mandate (including Venezuelans displaced abroad, excluding Palestine refugees under UNRWA’s mandate). The data is available by country of origin, by country of asylum and, within the country of asylum, at the sub-national level as indicated by the variables location and urbanRural. The variable statelessStatus indicates whether the reported population is stateless (“STL” and “UDN”) or not stateless (“NSL”). The variables [sex]_[agebracket] contain the counts of refugees as of 31 December 2020 in the individual sex and age brackets for the respective combination of geography and stateless status. For example, female_12_17 contains the number of female refugees aged 12 to 17. The variable totalEndYear is the total number of refugees over all sex/age categories.
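To make the record layout concrete, the following minimal Python sketch builds one made-up row and checks that its sex/age counts add up to totalEndYear. Only female_12_17, location, urbanRural, statelessStatus and totalEndYear are names taken from the text. The column names `origin` and `asylum`, the remaining bracket suffixes (for example `female_60` for the oldest group), the coding of urbanRural and all values are assumptions for illustration.

```python
# One hypothetical row of the end-year demographic table.
# All values are invented; several column names are assumptions (see lead-in).
record = {
    "origin": "AAA",             # assumed column name, placeholder country code
    "asylum": "BBB",             # assumed column name, placeholder country code
    "location": "Example City",
    "urbanRural": "U",           # assumed coding of the urban/rural indicator
    "statelessStatus": "NSL",
    "female_0_4": 10, "female_5_11": 15, "female_12_17": 12,
    "female_18_24": 20, "female_25_49": 40, "female_50_59": 8, "female_60": 5,
    "male_0_4": 11, "male_5_11": 16, "male_12_17": 13,
    "male_18_24": 22, "male_25_49": 38, "male_50_59": 7, "male_60": 4,
    "totalEndYear": 221,
}

SEX_PREFIXES = ("female_", "male_")


def sex_age_total(rec):
    """Sum all counts stored in [sex]_[agebracket] columns."""
    return sum(v for k, v in rec.items() if k.startswith(SEX_PREFIXES))


def is_stateless(rec):
    """STL and UDN mark stateless populations, NSL marks non-stateless ones."""
    return rec["statelessStatus"] in {"STL", "UDN"}


# For a fully disaggregated row the bracket counts should match the total.
print(sex_age_total(record) == record["totalEndYear"])
print(is_stateless(record))
```

A check of this kind only applies to rows with complete sex/age information; rows without a demographic breakdown would carry a total but no bracket counts.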
The pre-defined sex-specific age brackets are 0-4, 5-11, 12-17, 18-24, 25-49, 50-59 and 60 years and older. For some population groups, data is only available for the overall 18-59 age group rather than for the finer brackets within this range. For others, only sex-disaggregated data without age information is available. Finally, there are population groups for which only the total end-year count is available, without any demographic information. These different levels of data availability are recorded in the variable typeOfDisaggregation in the dataset: “Sex/Age fine” for the most granular age brackets, “Sex/Age broad” for populations reported with the 18-59 age bracket, “Sex” where only counts of female and male refugees are available without age information, and “None” for populations without any available demographic information.
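Because the broad 18-59 bracket is the sum of the three fine adult brackets, populations reported at the fine level can be collapsed to the broad level for joint analysis, while the reverse is not possible without further assumptions. A minimal sketch of that aggregation, using hypothetical bracket keys and made-up counts for one sex:

```python
# Bracket keys are illustrative labels, not confirmed column names.
BROAD_BRACKETS = ["0_4", "5_11", "12_17", "18_59", "60"]
ADULT_FINE_BRACKETS = ("18_24", "25_49", "50_59")


def to_broad(fine_counts):
    """Collapse counts in the fine brackets into the broad brackets."""
    broad = {}
    for bracket in BROAD_BRACKETS:
        if bracket == "18_59":
            # The broad adult bracket is the sum of the three fine ones.
            broad[bracket] = sum(fine_counts[b] for b in ADULT_FINE_BRACKETS)
        else:
            broad[bracket] = fine_counts[bracket]
    return broad


# Made-up counts for one sex in one population group.
fine = {"0_4": 10, "5_11": 15, "12_17": 12,
        "18_24": 20, "25_49": 40, "50_59": 8, "60": 5}
broad = to_broad(fine)
print(broad)
```

The collapsed counts keep the same overall total, so shares computed at the broad level stay comparable across “Sex/Age fine” and “Sex/Age broad” populations.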
Table @ref(fig:t.typeOfDisaggregation) shows the availability of sex/age-disaggregated data in terms of both the global number of refugees and the number of countries of asylum. Age- and sex-disaggregated data is available for 75 per cent of the global refugee population, and data disaggregated only by sex is available for a further 4 per cent.
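A coverage summary of this kind can be derived by summing totalEndYear within each value of typeOfDisaggregation and counting the distinct countries of asylum per category. The sketch below assumes a hypothetical `asylum` column and uses invented rows; the resulting percentages are illustrative and are not the UNHCR figures quoted above.

```python
from collections import defaultdict

# Invented rows; "asylum" is an assumed column name with placeholder codes.
rows = [
    {"asylum": "AAA", "typeOfDisaggregation": "Sex/Age fine", "totalEndYear": 500},
    {"asylum": "BBB", "typeOfDisaggregation": "Sex/Age broad", "totalEndYear": 200},
    {"asylum": "CCC", "typeOfDisaggregation": "Sex", "totalEndYear": 100},
    {"asylum": "CCC", "typeOfDisaggregation": "None", "totalEndYear": 150},
    {"asylum": "DDD", "typeOfDisaggregation": "None", "totalEndYear": 50},
]


def coverage(rows):
    """Persons, share of all persons and number of asylum countries per category."""
    persons = defaultdict(int)
    countries = defaultdict(set)
    for row in rows:
        category = row["typeOfDisaggregation"]
        persons[category] += row["totalEndYear"]
        countries[category].add(row["asylum"])
    grand_total = sum(persons.values())
    return {
        category: {
            "persons": persons[category],
            "per_cent": 100 * persons[category] / grand_total,
            "countries": len(countries[category]),
        }
        for category in persons
    }


summary = coverage(rows)
for category, stats in summary.items():
    print(category, stats)
```

A country of asylum can appear under more than one category when different populations within it are reported at different levels of detail, so the country counts need not add up to the number of countries.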
In the past, UNHCR has reported the sex/age breakdown found in the available data as global and regional aggregates describing the demographic distribution of all refugees. Figure @ref(fig:p.obsDemographicsBroad.short) shows the proportion of female and male children and adults in the global refugee population with available demographic data: 46 per cent are children under the age of 18, and 49 per cent are women and girls.
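Shares such as the proportion of children or of women and girls are simple ratios over the population with available data. A minimal sketch with made-up counts in the broad brackets (the dictionary layout and bracket keys are assumptions for illustration):

```python
# Brackets covering ages 0 to 17, i.e. children under 18.
CHILD_BRACKETS = ("0_4", "5_11", "12_17")

# Made-up counts by sex and broad age bracket, not UNHCR data.
counts = {
    "female": {"0_4": 10, "5_11": 15, "12_17": 12, "18_59": 68, "60": 5},
    "male": {"0_4": 11, "5_11": 16, "12_17": 13, "18_59": 67, "60": 4},
}


def demographic_shares(counts):
    """Share of children and share of women and girls among all counted persons."""
    total = sum(sum(by_age.values()) for by_age in counts.values())
    children = sum(by_age[b] for by_age in counts.values() for b in CHILD_BRACKETS)
    female = sum(counts["female"].values())
    return {"children": children / total, "female": female / total}


shares = demographic_shares(counts)
print({name: round(100 * value, 1) for name, value in shares.items()})
```

The denominator here is only the population with a sex/age breakdown; persons without demographic information do not enter the calculation at all.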
Sex/age distribution of refugees with available data end-2020
Figure @ref(fig:p.obsDemographicsBroad.age) shows the split between female and male refugees in each age bracket with available data. There is a slight surplus of boys and men in all age groups up to 59 years, and there are slightly more women than men among refugees aged 60 and older.
Sex distribution within age brackets of refugees with available data end-2020
By reporting the observed demographic distribution as the sex/age structure of the entire refugee population, including the part without available data, we assume that the 25 per cent for whom no age information was available at the end of 2020 have the same distribution as those with available data. This is a very strong assumption of ignorability of the missing data, and it is difficult to check without further proxy information on the sex/age distribution in the missing part of the data.
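A toy calculation illustrates why this assumption matters. Suppose two hypothetical population groups of equal size differ in their share of children, and demographic data is missing for half of the group with fewer children. All numbers below are invented, and the sketch additionally assumes that within group B the covered half is representative of the whole group.

```python
# Two invented population groups; "with_data" is the part with a sex/age breakdown.
groups = [
    {"name": "A", "total": 1000, "children": 500,
     "with_data": 1000, "children_with_data": 500},
    {"name": "B", "total": 1000, "children": 300,
     "with_data": 500, "children_with_data": 150},
]

# True share of children, which would only be known with complete data.
true_share = sum(g["children"] for g in groups) / sum(g["total"] for g in groups)

# Share of children among persons with available data only.
observed_share = (sum(g["children_with_data"] for g in groups)
                  / sum(g["with_data"] for g in groups))

# Reporting the observed share for everyone implicitly assigns it
# to the 500 persons in group B without data.
bias_points = 100 * (observed_share - true_share)

print(round(100 * true_share, 1))
print(round(100 * observed_share, 1))
print(round(bias_points, 1))
```

In this constructed case the observed share overstates the true share of children because the group with missing data has a different age structure, which is exactly the situation in which the missing data is not ignorable.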
Demographic disaggregation coverage by region of asylum
Figure @ref(fig:p.typeOfDissaggregationBroad.asyregionhcr) shows the proportion of the refugee population in each region for which sex- and sex/age-disaggregated data was available at the end of 2020. While demographic coverage is close to universal for refugees hosted in Sub-Saharan Africa and the MENA region,
Globally - % per age/sex cat. - % age missing - % age and sex missing
Demographic disaggregation coverage by region of origin
By CoO - % per age/sex cat. - % age missing - % age and sex missing
Demographic disaggregation coverage by origin country and asylum region
2020 end-year refugee/Venezuelan population by origin country and asylum region
By CoA - % per age/sex cat. - % age missing - % age and sex missing
Discuss types and reasons for missingness (NMAR) and outline modelling approach to overcome. Why is using available data so bad?